Sparse Coding

Author

  • Rishabh Mehrotra
Abstract

Sparse modelling calls for constructing efficient representations of data as a combination of a few typical patterns (atoms) learned from the data itself. Significant contributions have been made to the theory and practice of learning such collections of atoms (usually called dictionaries) and of representing the actual data in terms of them, leading to state-of-the-art results in many signal processing, image processing, and data analysis tasks. Sparse coding is the process of computing the representation coefficients $x$ from a given signal $y$ and a given dictionary $D$. Exact determination of the sparsest representation is an NP-hard problem [1]. This report briefly describes some of the approaches in this area, ranging from greedy algorithms to $\ell_1$-optimization, all the way to simultaneous learning of adaptive dictionaries and the corresponding representation vector.

1. Problem Statement

Using a dictionary¹ matrix $D \in \mathbb{R}^{n \times k}$ that contains $k$ atoms $\{d_j\}_{j=1}^{k}$ as its columns, a signal $y \in \mathbb{R}^{n}$ can be represented as a sparse linear combination of these atoms. The representation may be either exact ($y = Dx$) or approximate ($y \approx Dx$). The vector $x \in \mathbb{R}^{k}$ holds the representation coefficients of the signal $y$. The problem at hand is to find the sparsest representation $x$, which is the solution of either

$$\min_{x} \|x\|_0 \quad \text{subject to} \quad y = Dx \tag{1}$$

or

$$\min_{x} \|x\|_0 \quad \text{subject to} \quad \|y - Dx\|_2 \le \varepsilon \tag{2}$$

where $\|\cdot\|_0$ is the $\ell_0$ norm, which counts the nonzero entries of a vector.

¹ Note: The dictionary referred to in this report is an overcomplete dictionary, with $k > n$.

2. Solution Approaches

This section briefly describes a few notable approaches to this problem, followed by a detailed description of one prominent solution (the K-SVD algorithm) in the next section.

2.1 Matching Pursuit

Mallat [2] proposed a greedy solution that successively approximates $y$ with orthogonal projections onto elements of $D$. The vector $y \in H$ (a Hilbert space) can be decomposed into

$$y = \langle y, g_{\gamma_0} \rangle \, g_{\gamma_0} + Ry$$

where $Ry$ is the residual vector after approximating $y$ in the direction of $g_{\gamma_0}$. Since the unit-norm atom $g_{\gamma_0}$ is orthogonal to $Ry$,

$$\|y\|^2 = |\langle y, g_{\gamma_0} \rangle|^2 + \|Ry\|^2 .$$

To minimize $\|Ry\|$, we must choose $g_{\gamma_0} \in D$ such that $|\langle y, g_{\gamma_0} \rangle|$ is maximum. In some cases it is only possible to find a $g_{\gamma_0}$ that is almost the best, in the sense that

$$|\langle y, g_{\gamma_0} \rangle| \ge \alpha \, \sup_{\gamma \in \Gamma} |\langle y, g_{\gamma} \rangle|$$

where $\alpha$ is an optimality factor that satisfies $0 \le \alpha \le 1$.

A matching pursuit is an iterative algorithm that sub-decomposes the residue $Ry$ by projecting it onto the vector of $D$ that best matches $Ry$, as was done for $y$. This procedure is repeated on each successive residue. Matching pursuit has been shown to perform better than DCT-based coding at low bit rates in both coding efficiency and image quality. The main problem with matching pursuit is the computational complexity of the encoder. Improvements include the use of approximate dictionary representations and suboptimal ways of choosing the best match at each iteration (atom extraction).

2.2 Orthogonal Matching Pursuit (OMP)

In Pati [3], the authors propose a refinement of the matching pursuit (MP) algorithm that improves convergence by using an additional orthogonalization step. Compared with MP, this method additionally computes a $k$-th order model for $y$,

$$y = \sum_{n=1}^{k} a_n x_n + R_k y, \quad \text{with } \langle R_k y, x_n \rangle = 0, \quad n = 1, \dots, k,$$

where, in the notation of [3], $x_n$ denotes the $n$-th selected dictionary element rather than an entry of the coefficient vector. Since the elements of $D$ are not required to be orthogonal, performing such an update requires an auxiliary model for the dependence of $x_{k+1}$ on the previously selected elements $x_n$, which is given by

$$x_{k+1} = \sum_{n=1}^{k} b_n x_n + \gamma_k, \quad \text{with } \langle \gamma_k, x_n \rangle = 0, \quad n = 1, \dots, k.$$

For a finite dictionary with $N$ elements, OMP is guaranteed to converge to the projection onto the span of the dictionary elements in at most $N$ steps.
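The two greedy schemes of Sections 2.1 and 2.2 can be contrasted in a minimal sketch. It is only an illustration under stated assumptions, not the implementation of [2] or [3]: the atoms are assumed to have unit norm and are stored as a plain list of column vectors, the function names (`matching_pursuit`, `orthogonal_matching_pursuit`, `best_atom`, `solve`) and the toy dictionary are hypothetical, and the orthogonalization step is done by re-solving the least-squares normal equations on the selected atoms instead of the recursive auxiliary-model update described in [3].

```python
import math


def dot(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_atom(residual, atoms):
    """Index and inner product of the atom maximizing |<residual, g>|."""
    inner = [dot(residual, g) for g in atoms]
    j = max(range(len(atoms)), key=lambda i: abs(inner[i]))
    return j, inner[j]


def matching_pursuit(y, atoms, n_iter=10, tol=1e-10):
    """MP: at each step, subtract the projection onto the best atom."""
    residual, coeffs = list(y), [0.0] * len(atoms)
    for _ in range(n_iter):
        j, c = best_atom(residual, atoms)
        if abs(c) < tol:
            break
        coeffs[j] += c
        residual = [r - c * g for r, g in zip(residual, atoms[j])]
    return coeffs, residual


def solve(A, b):
    """Solve a small square system A c = b (Gaussian elimination, pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / M[i][i]
    return sol


def orthogonal_matching_pursuit(y, atoms, n_iter=10, tol=1e-10):
    """OMP: after each selection, refit all selected coefficients so the
    residual is orthogonal to every atom chosen so far."""
    residual, support, sol = list(y), [], []
    for _ in range(min(n_iter, len(atoms))):
        j, c = best_atom(residual, atoms)
        if abs(c) < tol or j in support:
            break
        support.append(j)
        # Normal equations G s = rhs restricted to the selected atoms.
        G = [[dot(atoms[a], atoms[b]) for b in support] for a in support]
        rhs = [dot(atoms[a], y) for a in support]
        sol = solve(G, rhs)
        residual = list(y)
        for s, a in zip(sol, support):
            residual = [r - s * g for r, g in zip(residual, atoms[a])]
    coeffs = [0.0] * len(atoms)
    for s, a in zip(sol, support):
        coeffs[a] = s
    return coeffs, residual


# Toy overcomplete dictionary (k = 3 atoms in n = 2 dimensions); hypothetical data.
D = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
y = [1.0, 1.0]
for name, algo in [("MP", matching_pursuit), ("OMP", orthogonal_matching_pursuit)]:
    x, r = algo(y, D)
    print(name, "coefficients:", x, "residual norm:", math.sqrt(dot(r, r)))
```

On this toy input, MP ends with three nonzero coefficients because it never revisits earlier choices, while OMP reaches a zero residual with two atoms after the refit. For dictionaries of realistic size, a numerically stable least-squares routine would be a more sensible choice than the hand-written normal-equation solver used here.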

Similar resources

Face Recognition using an Affine Sparse Coding approach

Sparse coding is an unsupervised method that learns a set of over-complete bases to represent data such as images and video. It has attracted increasing interest for image classification applications in recent years. However, in cases where similar images come from different classes, as in face recognition applications, different images may be classified into the same class, and hen...

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Rice Classification and Quality Detection Based on Sparse Coding Technique

Classification of various rice types and determination of their quality is a major issue in the scientific and commercial fields associated with modern agriculture. In recent years, various image processing techniques have been used to identify different types of agricultural products. Various color-based and texture-based features have also been used to achieve the desired results in this area. In this ...

A Novel Image Denoising Method Based on Incoherent Dictionary Learning and Domain Adaptation Technique

In this paper, a new method for image denoising based on incoherent dictionary learning and a domain transfer technique is proposed. The use of sparse representation is one of the most interesting areas for researchers. The goal of sparse coding is to approximately model the input data as a weighted linear combination of a small number of basis vectors. Two characteristics should b...

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where many low-level image representations exist, e.g., SIFT, HOG, and so on. However, there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation, and principal component analysis, are employed to d...

Structural Similarity Sparse Coding

Sparse coding theory demonstrates that, from a statistical viewpoint, the neurons in the primary visual cortex form a sparse representation of natural scenes. In this paper, we propose a novel sparse coding model based on structural similarity for feature extraction from natural image patches. The advantage of our model is that it can preserve structural information from a scene, which human visual per...


Publication date: 2011